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Abstract — We consider networks where nodes differ in clock 
frequency and phase, and aim to synchronize to a master 
clock. Nodes exchange packets and are able to exchange and 
measure noisy time information. When the measurement noise is 
Gaussian, we propose two fully distributed, cooperative network 
synchronization algorithms. These algorithms are derived using 
belief propagation and mean field message passing, are shown to 
be convergent, and outperform a variety of competing algorithms. 
Our performance results are complemented by and compared 
with the relevant Bayesian Cramer-Rao bounds. 

Index Terms — Network synchronization, Belief propagation, 
Mean field, Distributed estimation, Bayesian Cramer-Rao bound. 
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I. Introduction 

SYNCHRONIZATION in wireless networks enables a 
number of important applications that rely on hard time 
constraints, such as distributed estimation or tracking JT), 12L 
time-division-multiple-access communication protocols O, 
and cooperative transmission (e.g., distributed beamforming) 
13) . These applications require a common notion of time, 
whereas each network node has its individual communication 
hardware and clock with individual timing behavior. These 
clocks are driven by oscillators, which, due to imperfections 
and environmental factors, may drift with respect to each other 
0. Hence, network nodes need to be synchronized to the 
extent required by the application. 

Network synchronization schemes differ mainly in how 
local time information is encoded, exchanged, and processed 
(6|. We will limit ourselves to the so-called packet-coupled 
synchronization, whereby local time is encoded in time stamps 
and exchanged via packet transmissions J7). Research in this 
area can be grouped into three broad fields: measurement 
protocols |7]-||9), performance bounds lfT0l - |[T4l . and practical 
synchronization algorithms (SJ, llT4l - ll2TI . Measurement pro- 
tocols can be one-way or two-way (7), and can be distorted by 
deterministic and stochastic effects J8). To reduce the effect 
of higher layers in the protocol stack, physical layer time- 
stamping can be used (9). For a collection of measurements, 
fundamental performance bounds can be derived: considering 
a pair of nodes, Cramer-Rao bounds are derived in ifTZl . 
|[T3l . later extended to Bayesian Cramer-Rao and Chapman- 
Robbins bounds iflOl . Further Cramer-Rao bounds have been 
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derived for star networks with a central master node in flTT], 
and for general networks in Ifl4l . However, these bounds do 
not consider available prior information on the unknown clock 
parameters. 

Much of the recent work on packet-coupled synchronization 
has focused on practical synchronization algorithms, which 
either steer the clock parameters towards an average value, 
or achieve a common clock value imposed by a master. 
An average value can be attained through average consensus 
and its variations lfT31 . Il22l or alternating direction method 
of multiplier consensus |[T6l . ifTTl . Alternatively, lfl8l uses 
additional constraints on the sum of relative clock parame- 
ters over cycles in the network to achieve a common clock 
value imposed by a master. A common drawback of these 
approaches is relatively slow convergence to the average 
value, especially for sparsely connected networks and in the 
presence of large outliers. In contrast, faster convergence in 
the presence of master nodes can be achieved when structure 
is imposed on the network: this can be a tree (e.g., for 
the flooding time synchronization protocol [j8]) or a cluster 
(e.g., for reference broadcast synchronization [TT9l ). Forming 
and maintaining the structure have an overhead cost, and 
make the network more sensitive to node failures. Moreover, 
nodes far away from the master exhibit degraded performance . 
Some of these drawbacks can be mitigated through centralized 
processing 04), J2TI . There is thus a clear need for network 
synchronization algorithms that combine the best features of 
all these approaches: fast convergence, distributed processing, 
and without network structure limitations. To address this, ll20l 
proposed a Bayesian algorithm based on belief propagation in 
a factor graph representation [23] of the problem, though only 
for a clock model where nodes have different clock phases. 
Unfortunately, the assumption of perfect clock frequencies is 
not reasonable for many practical applications, thus requiring 
frequent re-synchronization. 

In this paper, we build on the work from J20l . considering 
both relative clock phases and clock skews. Our contributions 
are as follows: 

« Based on a real measurement model, we derive an 
approximate, yet accurate statistical model that allows 
a Gaussian reformulation of the maximum a posteriori 
estimation of clock parameters, enabling the development 
of computationally simple algorithms; 

• We propose two fully distributed, cooperative synchro- 
nization algorithms based on message passing on a suit- 
able factor graph. We demonstrate their convergence and 
their superior convergence rates, compared to state-of- 
the-art methods; and 

• We derive a Bayesian Cramer-Rao bound (BCRB), which 
serves as a fundamental performance bound for the case 




Figure 1. Connectivity graph of a wireless network with M 
node (shaded) and A = 4 agent nodes. 



1 master 



when prior information on the local clock parameters is 

available. 
The remainder of this paper is organized as follows: In Section 
im we introduce the clock model, the network model, and 
the measurement protocol. In Section [TTTJ we provide an 
overview of the state-of-the-art on distributed synchronization, 
under both phase and skew uncertainties. Exact and simplified 
statistical models of measurement likelihoods and clock priors 
are derived in Section [IV] followed by a description of the 
BCRB in Section fVl The simplified models from Section HV1 
are used to derive two Bayesian algorithms based on message 
passing in Section [VI] In Section IVII1 we numerically study 
the properties of the proposed algorithms and compare them to 
state-of-the-art algorithms from Section [HI] Finally, we draw 
our conclusions in Section IVII1I 

II. System Model 

A. Network Model 

We consider a network comprising a set A4 of master nodes 
and a set A of agent nodes (see Fig. [TJ. The M = \A4\ > 
fully synchronous master nodes impose a common time 
reference to the network. The A = \A\ agent nodes have 
imperfect clocks that may not run synchronously with the 
reference time. Any node i e Ai U A is able to communicate 
with its neighbors N(i) in communication range from node 
i, where the communication is supposed to be bi-directional. 
All connections (i, j) are collected in the set of edges £. 
The network is assumed to be connected, i.e., there is a path 
between every pair of nodes. 

B. Clock Model 

Each network node i possesses a clock displaying local time 
Ci{t), related to the time reference (or true time) t by 



Ci(t) = ocib + fa, 



(1) 



where f3i is the clock phase of node i and on is the clock 
skew of node i (7). When i G M., an = 1 and /3, = 0. When 
i G A, both a.i and /?,; are considered as random variables. The 
clock phase j3i depends on the initial network state, and can be 
modeled with an uninformative prior (e.g., as uniformly dis- 
tributed over a large range, or, equivalently, having Gaussian 



distribution with a large variance ai t 11201 ). The clock skew 
cti depends on the quality of the clocks, typically expressed 
in parts per million (ppm), and is modeled as a Gaussian 
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Figure 2. Time instants at a reference time t of the n-th two-way timing 
message exchange between node j and node i with the local clocks Cj (t) and 

Ci(t). 



random variable J24l with mean 1 and variance a 2 a t . Note that 
nodes with more sophisticated clocks will have smaller a^ t . 
The following notation will be convenient: 9i = [eei,/3i], 

6a = [e?,ef}T, e = [ej,...,e^, e> = [im,am] t , 

0/\ T lT __j n, r^/NT 
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C. Measurement Model 

Following fl4l . pairs of nodes (i,j) 6 £ exchange data 
packets with time stamps, as shown in Fig. [2] Node i transmits 
Kij > 1 packets to node j and receives Kji > 1 from node 
j. The fc-th packet from node i is transmitted at time t\i Q and 
received by node j after a delay S\- , at time 



(k) 



t {k) +s (k) 

hj,Q + °ij ■ 



(2) 



The nodes have only access to CjfyA) and Ci(t\ ), which 
can be related to (fJJ through ([!} as 
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(3) 



for k G {1, . . . , Kij}. A similar relation holds for the packets 
sent by node j to node i, by exchanging i and j in (01. We 
will assume that transmit times Cj(^- ) are recorded exactly, 
so that the aggregate observation between nodes i and j is 
given by 
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The delay (5,- can be broken up as <5 ? - 



A,; 
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0, 



where Ay is a deterministic component (related to coding and 

(k) 

signal propagation) and w ,■ ■ is a stochastic component. We 
model Ay = T c + Tf : ij as comprising a hardware related 
computation time T c , and a time of flight T/ t y, We further 
suppose that Ay = Ajj. To model w\~,\ we have performed 
a measurement campaign with two Texas Instruments ez430- 
RF2500 evaluation boards. We placed the boards 1 meter apart 
and transmitted 10,000 packets, collecting^ the corresponding 

'Via the general debug output (GDO with GDOx_CFG = 6) of the CC2500 
transceiver chip, time of transmission and time of reception was measured. 
Note that the GDO pins reference is valid for the concept of physical layer 
time-stamping (9)- 




7.6 7.8 8 8.2 8.4 

Transmission delay in fis 

(k) 

Figure 3. Measurement data and Gaussian fit of the delay 5^ ■ = Ajj - 



, (fc) 



transmit t) , ■ n and receive times t). \ . Their measured difference 

on an oscilloscope reveals the delays S\j according to (0, 
shown as a histogram in Fig. [3] We conclude that a zero- 
mean Gaussian model for w\j is appropriate, which is con- 
gruent with the models from J7), lfT0l - |[T3l . We will denote 

(k) 9 

the variance of the measurement noise w 4 by u^, for all 

D. Network Synchronization 

Our goal is to infer the local clock parameters on and /3j 
(or an invertible transformation thereof), based on the mea- 
surements and the prior clock information. In the following 
sections, we will describe standard approaches to solve this 
problem, followed by our proposed Bayesian approach. 

III. State-of-the-Art 

In this section, we briefly present existing synchronization 
methods for the presented clock and network model. We limit 
our overview to algorithms that perform parameter estimation 
based on time stamp exchange in a fully distributed manner, 
where every node runs the same algorithm. Within this class, 
we discuss approaches based on average consensus 031 . 
alternating direction of multiplier method (ADMM) [17|, and 
loop constrained combination of pairwise estimations Ifl8l . 

A. Average TimeSync 

Consensus protocols are based on averaging information 
received from neighbors, and thus have low computational 
complexity. Moreover, master nodes are not required and, if 
present, may lead to a decrease in convergence speed. In the 
Average TimeSync (ATS) algorithm from 03J, every node has 
a virtual clock 

Ci{ci{t)) = &ia{t) + Pi — UiUit + aifii + Pi, 

which is controlled by a virtual skew on and a virtual phase $i . 
By adjusting the virtual skew and phase, ATS assures asymp- 
totic agreement on the virtual clocks lim Ci(ci(t)) = r v (t), 

t— »oo 



Vi G A, where r v (t) is a network- wide common time. The 
algorithm assumes Sy = 0. 

B. ADMM Consensus 

In ADMM consensus from ifTTl . relative skews ajj = cn/otj 
and relative phase offsets j3ij = /?.; — f3j are assumed to be 
available a priori (e.g., from an estimation algorithm such as 
Ifl2l ). Then follows a network-wide correction of the local 
clock parameters in discrete instances kAT, Collecting the 
clock skews in T/W = [a[ k} AT, ..., a^ ] AT} T , and the clock 



phases in fi( k > = [(3\ 
v( fc ) are applied as 



..,f3 A '] T , control signals u^ and 
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The computation of v\ at a node i is based on ADMM and 
requires knowledge of ctjk for all nodes j e N(z), k E N(j), 
i.e., from all two-hop neighbors. It can be shown that as 
k — > +cxd, T^ fe ) — > T ■ 1a,i f or some common value T, 
where lk,i denotes a k x I matrix with all entries equal to one. 
Finally, agreement on the clock phases is achieved through the 
control signal u^ using average consensus. Although fully 
distributed and master-free, ADDM consensus requires inner 
iterations for offset measurements and outer iterations for the 
consensus. It further relies on a step size parameter, whose 
optimal value depends on global network properties. 

C. Loop Constrained Synchronization 

In |[T8l . it was observed that for every closed loop C in the 
network, it is such that J^i jec %» = 0> f° r ^v — Pi ~ Pj 
and £ij = log(<Xi/aj). Using these constraints, the absolute 
clock values are determined via coordinate descent of the least 
squares problem |[T8l 

v = argrnin ||Av — x|j , 

V 

where A is the incidence matrix representing a directed 
topology, v the vector of absolute clock parameters (phase or 
skew) and x is the collection of offset measurements. In order 
to find a global optimum, a master node needs to be selected. 
During the iterations, local estimates on absolute skew and 
phase are exchanged with all one-hop neighbors. 

IV. Statistical Models 

The above-mentioned algorithms are all non-Bayesian, and 
thus do not fully exploit all statistical information present in 
the network. When clock skews are known, fast, distributed 
Bayesian algorithms were derived in [[201 . When clock skews 
are unknown, the naive extension of ll20l would lead to imprac- 
tical algorithms, due to the complex integrals that need to be 
computed. In this section, we propose a series of approxima- 
tions to measurement likelihoods and prior distributions, with 
the aim of a simple representation of the posterior distribution. 
Using this simplifications, the maximum a posteriori (MAP) 
estimate of the clock parameters (or a transformation thereof) 
can be found with reasonable complexity. 



A. Likelihood Function 



can be interpreted as Gaussian in the transformed parameters 



For every pair of nodes, there is a local likelihood function 6 ir As we wl11 see ln Sectlon EH thls latter observation has 
p(cy |0y; Ay), which due to the conditional independence of advantages in the algorithm design for distributed parameter 
the measurements, can be factorized as estimation since it leads to simpler computation rules. 



/r, „2 2 \~ K i3/ 2 



(4) 



exp 



j i^j 



Fj_y(0y, Aj 



x(2 7 ra J 2 ^)-^ /2 cx P 



2a 2 a 2 
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(dij , Ay ) is obtained by stacking 
Cj(ty o) — ft ) aj/ai+(3j+Aijaj, which is the deterministic 



where 

Ak) 



part from (|3}, in a column vector of length Kij . Observe that 
© is difficult to work with, since (i) the factor outside the 
exponential depends on cm and tx,; and (ii) the dependence 
on the unknown delay Ay. 

The first problem can be avoided by approximating 
(2ira 2 crf n ) ' 3 by (27rCT^) " , which leads a good ap- 
proximation of (HJi, as long as the clock skews are close to one, 
which is the physically relevant case. We deal with the second 
problem by computing the maximum likelihood estimate of 
Ay and substituting the estimate back into the likelihood. 
Taking the logarithm of (|4]i and setting the derivative with 
respect to Ay to zero leads to the following estimate 



ft 



& 



Ay (0y ) = a, \-Qj h bij hj — 



Oii 



a^ 



(5) 



where ai,a,j,bij are functions of the observations, detailed 
in Appendix [A] Substituting (0 in (J4j and considering the 
approximation of the normalization constant leads to the 
following approximate likelihood function 



p(cy|0y) ocexp 
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by] <8> ljf if +jf^,i, 



where (g) denotes the Kronecker product. The approximated 
likelihood function in (0 no longer contains the delay Ay and 



B. Prior Information 

Since our simplified likelihood function now has a Gaussian 
form in the transformed clock parameters 0^-, we need to 
select a suitable Gaussian prior so as to end up with a Gaussian 
posterior distribution. We will choose 

p(0£)=JVe{(Mp,i.Ep,i), (7) 

where Nx.(n, S) denotes a Gaussian distribution in x with 
mean vector fi and co variance matrix X. For master nodes, 
we have perfect knowledge of skew and phase, so we set 



Mp,j = 



S p ,i — 



for i G M (8) 



For agent nodes, we use the prior information on the clock 
skews from the local oscillator specification. Denoting a, = 
1 + Si with \si\ < 1, it follows that 1/a, = 1/(1 + £;) ps 
1 - £,, so E{1/oj} ps 2 - E{a 4 } = 1 and E{l/af} - 
E{l/ai} ps E {a 2 }— E{ai} = cr 2 ^. Under our assumption 
of an uninformative prior on ft, it follows that ft/a, can also 
be modeled with an uninformative prior. Hence, we find 

rl, 



Vp- 



J p,' 







'/9,i. 



for i G .4, (9) 



where <r| , t > cr 2 



C. Posterior Distribution and Estimator 

Putting together the approximate likelihood function from 
Section lTV-Al with the prior in the transformed parameters from 
Section IIV-BI we find the following posterior distribution in 
the transformed parameters 

p(0'i C )<x n p(*j) n p(c y |0^.), do) 

iGAUM (ij')e£ 

which is a Gaussian distribution in 0'. The inverse covariance 
matrix S _1 of this Gaussian turns out to be highly structured, 
with block entries (for i,j G A) 



p-' 
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for j G N(i) 
else. 
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If we are able to marginalize p(6'\c) to recover p(6[\c), we 
can compute the MAP estimate of 6[, i G A as 



0, 



argmaxp(0^|c) 

o ■ 



(12) 



arg max 
o ■ 



P(0'\c)dO{, 



where 04 indicates that the integration is over all 0' except 
0'. From 9 { we can further determine the clock parameters 
by dj = l/[0j]i and ft = [0j2/[0Ji, where [-] m extracts the 
?n-th element of a vector. Solving this problem in a distributed 
manner will be the topic of Section |VT] 
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Figure 4. Factor graph of the posterior distribution for a 5 node network 
with M = {1} and A = {2, 3, 4, 5}. 



V. Bayesian Cramer-Rao Bound 

Based on the statistical models from Section I1VI it is 
possible to derive fundamental performance bounds on the 
quality of estimators. One such bound is the BCRB, which 
gives a lower bound on the achievable estimation accuracy on 
6i 11251 . The BCRB is derived based on the Fisher information 
matrix, assuming known A,-j for every link: 



J = -E, 



e.c 



E fl 



{v e {V e [logp(0|c;A)]} T } 

{v e {V e [logp(c|0;A)]} T } 

{v e {V e [logp(0)]} T } 



E e [J,]+E e [J p ], 



(13) 



in which the matrix J p represents the contribution of the 
prior information, and is a diagonal matrix with block entries 
equal to the covariances matrices in ((9). The matrix J; repre- 
sents the contribution of the likelihood function p(c\8; A) = 
N c (fJ-i, Sj), which is the product of the pairwise functions in 
©. It is computed as 



[J/ 



dfif __i<9M; 
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00,- 



-trace 



1 dOt l 89, 



(14) 



for i,j <E A. J; has 2x2 non-zero blocks in the main diagonal 
and in i-th row and j'-th column when j G N(i). Thus, it will 
have the same structure as the inverse covariance matrix in 
(fTTl i. Additional details are provided in Appendix [B] Finally, 
the BCRB on a certain parameter, say the k-th parameter in 
the 2A-dimensional vector 6, is given by 



BRCBi 
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VL Distributed Parameter Estimation 

To solve the marginalization in ( fT2l in a distributed way, 
we use approximate inference via message passing on factor 
graphs. In the following, we describe the factor graph for 
the synchronization problem and motivate the use of message 
passing for optimum retrieval of posterior marginals. Finally, 
we derive two synchronization algorithms. 

A. Factor Graph 

The factor graph associated to the factorization in (TTOl is 
found by drawing a variable vertex for every variable (drawn as 
circles) and a factor/function vertex for every factor (drawn as 



rectangles). Vertices are connected via edges according to their 
functional dependencies. The factor graphic that corresponds to 
the connectivity graph in Fig.[T]is depicted in Fig. [4] Note that 
every variable vertex corresponds to the variables of a physical 
network node and that every factor vertex corresponds to a 
measurement link in the physical network. Thus, the structure 
of the connectivity graph is kept in the factor graph: a tree 
connectivity remains as tree factor graph, a star connectivity 
remains as star factor graph, and so on. 

Factor graphs are combined with message passing meth- 
ods in order to compute, e.g., marginal posteriors. Dif- 
ferent message passing methods lead to different perfor- 
mance/complexity trade-offs. A framework to compare mes- 
sage passing method is found through variational free energy 
minimization. 

B. Energy Minimization for Marginal Retrieval 

Our goal is to find practical methods to determine, exactly 
or approximately, the marginals from (TTZV From IJ271 , one 
strategy is to minimize the variational free energy for a 
positive function b(O') approximating p(0'\c): 

&*(•) = argmin f b(8') log ^-dfl'. (15) 

6(-) J P{V'\C) 

As algorithm designers, we can impose structure to the 
function b(6') to allow efficient solving of ( fTBI ). We will 
consider two classes of functions: (i) the Bethe method, in 
which b(6') is constrained to be a product of factors of the 
form biifii) and bij(6' i ,d' l j); and (ii) the mean field method, 
which constrains b(0') to be of the form b(0') = IIA'^)- 
Minimizing (TT~5T > subject to the constraints imposed by the 
approximations, leads to the message passing rules 11271 . The 
message passing rules turn out to be the belief propagation 
(BP) equations for class (i) and the mean field (MF) equations 
for class (ii). 

In the following we use the shorthand pij for p(cy|0 ? ' ■). 
Furthermore, since the approximated joint posterior distribu- 
tion in ([Tol l is Gaussian in 0' , we consider only messages that 
are Gaussian in 0' . 

C. Synchronization by Message Passing 

Above we motivated two message passing schemes, BP and 
MF. By applying both, we find two synchronization algorithms 
where network nodes cooperate by the exchange of messages. 
We now present the algorithms in detail, and discuss their 
salient properties. A unified view of the message passing is 
offered in Fig. [5] 

1) Belief Propagation: The BP message from a factor 
vertex p^ to a variable vertex 6^ is given by 11231 Eq. (6)] 

«A/^(Min,-y,£in,ij), (16) 

2 The representation differs slightly from the factor graph presented in 1201 . 
as in our case both nodes have access to the same function vertex since they 
share the measurements. The presentation in 1201 accounts for 2 disjoint sets 
of measurements that are not shared between the nodes 1261. 
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Figure 5. Messages between a node pair i,j of a general network. Since 
the measurements are shared, both nodes have access to the same function 
vertex. 



while the BP message from a variable vertex 6[ to a factor 
vertex pij is given by lf23l Eq. (5)] 

m e ^ Pij (e' i )= P (8' i ) U m Pik ->e>M) 

fee{N(i)\j} 



K-^e- (M ra t 1 ij) S est,»j) , 



(17) 



where we use the index "in" for intrinsic and "ext" for extrinsic 
with respect to a variable vertex. As depicted in Fig. each 
network node i corresponding to the variable vertex 9^ needs 
to compute its intrinsic and extrinsic message. Furthermore, 
note that for BP, the extrinsic message mg'^. has to be 
determined separately for every node j G N(i). For message 
computation, the parameter updates (for detailed derivations, 
see Appendix Q of (fT6l l are, if the neighboring node is an 
agent, j G N(i) n A 



.2 n V \ *3 »J 



A;,Aj- ^i^ii^-i 



^in.ijMii 



1 



2 *J U 
~Q^extjiMe 



and if the neighboring node is a master, j G N(i) n ,M 
v-i = 4-A£a 



in, i j 
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(18) 
(19) 

(20) 
(21) 



The parameter updates of (fTTt are 

X. S in,* 



E" 1 -- = E" 

ext, 2 j Pj$ 



(22) 
fee{N(i)V} 

^ext,-yMext,j.7 = ^p,i Mp,i + ^^ ^in,fci^in,fci- (23) 

fee{N(i)\j} 

The approximate marginal is obtained by 

6i(fl0ocp(fli) II "W-*{(*i) 

fcSN(i) 

ocJV fl{ (Mi,Si). (24) 

The parameters of the marginal belief (l24l are computed 
from the parameters in (l22t and (l23l . but with the additional 
summation over j. 



In the communication between two connected nodes i and 
j as in Fig. [5] node i transmits me'.-> Pi to j and vice versa. 
The receiving node then computes its intrinsic message to the 
variable vertex (e.g., node j computes m Pi ^e:)- As a node i 
has evaluated the intrinsic messages from all its neighbors, it 
can determine again its extrinsic messages. After / iterations, 
every node i computes the marginal belief bj(6'A and thereof 
the MAP estimates of its clock parameters. 

2) Mean Field: The MF message from a factor vertex p^ 
to a variable vertex 0- is given by lf28l Eq. (14)] 



i Pi ^ e >m = cx P (y log {PiMj)) bM) de ' 3 



oc Me: (jtt 



m,iji ^ u ra,l]) i 



(25) 



and the message from a variable vertex B' i to a factor vertex 
Pij is given by the belief l28l Eq. (16)] 



b i {B' i )^p{B[) ' 



>o>M) 



kGN(i) 



(26) 



The corresponding parameter updates (for detailed derivations, 
see Appendix fOT> are 



1 



-AAA, 



S in,yAVij — 2 A ij » 



-AAB..JZ, 



and 



(27) 
(28) 

(29) 



^- 1 = S~ 1 + V E-\- 

z Pj* / ^ in,fci 

fcGN(i) 

E 4 - 1 Mi =S- 4 1 ^ ii + Yl ^M^M- (30) 

feeN(j) 

For MF, two connected nodes (i,j) only need to exchange 
their beliefs instead of extrinsic information (see Fig.|5). Since 
the same information is sent to all neighbors, this can also be 
performed in a broadcast scheme. From the belief, the receiver 
then can compute the intrinsic message ( T25l l. 

3) Belief Propagation vs. Mean Field: We now discuss the 
difference between both methods with respect to convergence 
and computational complexity. 

• Convergence BP computes the true marginals if the fac- 
tor graph is a tree. If the factor graph has cycles, this can 
not be guaranteed in general. However, for joint Gaussian 
distributions, as in our case for dTOl . sufficient conditions 
for convergence exist. Furthermore, if Gaussian BP con- 
verges, it is known that it converges to the true mean 
value |29l . Popular convergence conditions are diagonal 
dominance |29l or walk-summability [30], which can be 
evaluated via the information matrix (fill . It turns out that 
diagonal dominance, while easy to checlfl is not fulfilled 
for ( [Tol l. In contrast, the walk-sum condition is difficult to 
prove for a general setting as it requires the spectral radius 

3 The absolute value of the main diagonal element in the information matrix 
of UOt must be larger then the sum of the absolute values of the remaining 
entries in the same row. 



of the block transformed information matrix. However, 
numerical evaluation revealed that the condition is always 
fulfilled for the used simulation settings. In contrast to 
BP, MF always converges BTl pp. 466]. Furthermore, MF 
converges to the true mean vector for Gaussian graphical 
models (32] pp.136]. 
• Computational complexity The BP rales are computa- 
tionally more complex, as can be observed by comparing 
(TT~8T > and (O with (f27]l and d28j. Moreover, BP requires 
the computation of an extrinsic message towards every 
single neighbor, i.e., for BP (flTt has to be computed 
for every neighbor per iteration whereas (1261 has to be 
computed only once per iteration. 

D. Scheduling and Implementation Aspects 

In this section, we discuss message scheduling and ways to 
efficiently combine timing information exchange and message 
passing in real applications. 

We consider a general topology as in Fig. |4] Every node 
i € A runs the same algorithm, and computes the mes- 
sage parameters to/from the function vertices as depicted in 
Fig. [5] Therefore, the node requires mg'.^ Pi .(0'j) to compute 
m Pi^rO'{6'i)- Together with the prior information, the node 
then computes the outgoing message m0'.^, Pi .{6' i ), which is 
sent to neighbor j for the next iteration. In order to start 
this procedure, all mgi.^ Pji (9',) in the factor graph have to 
be initialized. This can be done by setting them to uniform 
distributions, with mean value zero and infinite covariance. 

In order to improve convergence speed, we use a flooding 
schedule, where a node only propagates if it has a significant 
belief on its parameters. Hence, in the first iteration only 
master nodes m £ Ai propagate messages. In the second 
iteration, also their neighbors j £ N(m) will send messages, 
in the third their neighbors' neighbors and so on. 

Finally, our derivations were based on the assumption that 
measurements were collected first, and then message passing 
was carried out. Since both phases rely on the exchange of 
packets between nodes, it is possible to combine them, thereby 
increasing the number of measurements as message passing 
iterations progress. Such piggybacking is beneficial in real 
applications. In order to successfully start the algorithm, a min- 
imum number of measurement packets has to be exchanged 
between every node pair to provide initial timing information. 
This is due to the required matrix inversions in the message 
parameter computations. 

E. Comparison with State-of-the-Art Algorithms 

We will now describe the main similarities and differences 
of BP and MF to the previously described state-of-the-art 
algorithms from Section [HI] We will use the shorthand: ATS is 
Average TimeSync from [15], ADMM is the method proposed 
in IfTTl . and LC is the loop constraint method from |[T8l . In 
Table [I] we compare the algorithms according to the following 
criteria: if masters are required/supported ("Master support"), 
if the algorithm can work on unknown topology ("Unknown 
top."), and if the synchronization is applicable for broadcast 
protocols ("Broadcast"). Moreover, we compare the number 



Table I 
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ADMM |17| 
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MF 


Master support 
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+ 


- 


+ 


+ 


+ 
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Inf. dist. (hops) 
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1 
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Delay sensitivity 
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+ 
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+ 


Low complexity 


++ 


+ 


~ 
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of hops required to pass packets for information computation 
("Inf. dist."), if the algorithm is insensitive to propagation 
delays Sy i=- ("Delay sensitivity"), and if the algorithm 
achieves the clock parameter estimation with low computa- 
tional complexity ("Low complexity"). The last measure is 
evaluated by the runtime of the computer simulation. We grade 
the agreement to the topics with: very good (++), good(+), 
medium (~), low (— ), very low ( ). 

We note that ADMM and BP can not be used in protocols 
that rely on broadcasts, since the exchanged synchronization 
information is different for every neighbor. Additionally, in 
order to compute outgoing information, ADMM requires the 
collection of timing information from neighbors in 2 hop 
distance, as compared to 1 hop for all other algorithms. The 
number of hops is directly related to the number of packet ex- 
changes before new estimation updates can be performed. We 
also observe that BP and MF suffer from higher complexity, 
whereas they are competitive in all other respects: they are 
not sensitive to propagation delays and support an arbitrary 
number of synchronous master nodes. Between BP and MF, 
MF has the advantage that it supports broadcast protocols. 

The remaining question regarding the estimation accuracy is 
addressed in the following section, where a superior behavior 
of BP and MF is observed. 

VII. Numerical Analysis 

A. Simulation Settings 

If not specified otherwise, we use the delay and noise 
setting from the measurements in Fig. [3] In particular, the 
noise standard deviation is er m = 93 ns and the deterministic 
delay A^ = T c + Tf,ij comprises a computational time 
T c = 7.6 /is and the flight time Tf^ = dij/v, where dij 
is the distance between nodes i and j, and v is the speed 
of light. Simulations are carried out on 50 random networks 
where M = 1 master nodes and A = 9 agent nodes are 
uniformly placed in a 100 mx 100m square area. Each node 
has a communication radius of 50 m which determines the 
connectivity of the network. An example topology is shown 
in Fig. [6] We further select K^ = Kji between all node pairs, 
where a measurement from node i to node j is always followed 
with a measurement from node j to node i. The time between 
two subsequent measurements is set to 10 ms. The clock skews 
are drawn from a normal distribution corresponding to a 100 
ppm specification, on ~ A/"(l,10~ 8 ), which represents the 
prior distribution. The clock phases are drawn from a uniform 
distribution in the interval [— 10, +10]s, where the Gaussian 
prior was specified with zero mean and a standard deviation of 
&l3,i = 5.8 s. In the following we will use the root mean square 
error as performance measure, denoted by "RMSE of phase" 



and "RMSE of skew". Since not all competing algorithms rely 
on a master node, the RMSE to the true clock parameters 
is not a meaningful measure in a direct comparison. Thus, 
for algorithms not supporting master nodes, the RMSE is 
evaluated with respect to the network's mean error of skew and 
phase. Errorbars indicate the variation that arise from different 
network realizations. 
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Figure 6. Randomly connected network with M = 1 (circle) and A 
(cross). 



Figure 7. Condition for convergence: Spectral radius of transformed 
information matrix, p(I— S X _1 S T ), its variations of 50 network realizations 
and the walk-summability (WS) bound. In all cases, the criterion 13 li was 
satisfied, so convergence of BP is guaranteed. 
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B. Study of BP and MF Synchronization 

1) Convergence: Recall that MF will always converge, 
while BP is guaranteed to converge when the walk-sum 
condition l30l holds. Mathematically, this translates to 

p(I - S X1- 1 S T ) < 1, (31) 

where £ _1 is the information matrix from (flOt , S is any block 
invertible transformations, and />(•) returns the spectral radius 
of its argument. Applying an eigenvalue decomposition to 

= VAV T , 
i 

with eigenvectors V and eigenvalues on the diagonal of A, 

we use the block-diagonal transformation 

[S] M = A-r/2 V r 

We confirmed that (l3~TT l was satisfied for the following large 
parameter space: 

• Number of nodes Me [5, 100]; 



-101 



• Noise variance cr^ G [10~ 20 , 10" 

• Time between two measure mentsj T r G [0.01, 1] s; 

• Computation time T c G [0, 0.5] s; 

« Communication range r G [50, 100] m; and 

• Number of measurements i£y + Kji G [2, 620]. 

The most significant influence was observed by varying the 
number of measurements. With standard settings, this can be 
observed in Fig. [7] 



4 As response time we define T r 
k are two subsequent events. 
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Figure 8. Convergence of parameter estimates with 7 measurements. 

2) Convergence Rate: In Fig. [8j we show the BP and 
MF mean square error of the phase and skew estimates as 
a function of the iteration index. We observed that both algo- 
rithms converge after a number of iterations that correspond 
to the largest multi-hop distance of a node to a master node|f| 
Furthermore, both algorithms converge to the same values. 
The gap between estimation accuracy and BCRB arises due 
to the estimation of the transformed parameters 9' instead of 
the original parameters 6. 

From the convergence analysis we can conclude, that the 
algorithms converge for practical relevant settings in a speed 
which can be predicted by the topology. 

3) Impact of Measurements: Here, we vary the noise vari- 
ance and the number of measurements, and provide results 
after 1 = 7 message passing iterations. In both cases, we 
observed that BP and MF gave rise to very similar results. 

The variation of the number of measurements is depicted in 
Fig. [9] where the estimation results are compared to the BCRB. 
With the number of measurements the estimation accuracy 
increases and the gap to the BCRB is reduced. 

Varying the noise variance a^ in Fig. [10] reveals its linear 
dependence to the estimation accuracy in double logarithmic 
scale. Both plots can be used as design criteria for the 
synchronization system. 

5 For the topology depicted in Fig. [6] which is part of the randomly 
generated topologies, the largest multi-hop distance is 4. This observation 
corresponds with the results shown by the simulation in Fig. [8] 
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Figure 9. Variation of number of time measurements between every node Figure 11. Root MSE of phase estimates for competing methods, 
pair with noise standard deviation <r m = 9.3 10 -8 s. 
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Figure 12. Root MSE of skew estimates for competing methods. 



Figure 10. Variation of measurement noise between every node pair with 
40 measurements. 



C. Comparison to other Algorithms 

We now compare BP and MF to other fully distributed state- 
of-the-art algorithms^ from Section|III] Since in those methods 
time information and synchronization information exchange is 
interleaved, we apply the concept of piggybacking, introduced 
in Section lVI-DI Furthermore, for a fair comparison, we set the 
computational delay T c = s, since not all methods account 
for deterministic delays between the nodes. Thus, the delay 
reduces to the time of flight, which is in the order of tenths 
of microseconds. 

In Fig.Q~T]and Fig- El simulation results for phase and skew 
estimation are shown. The simulations were performed on the 
fixed network topology as depicted in Fig.|6]and the results are 
averaged over 50 runs. For both phase and skew estimation, 
the proposed algorithms outperform the competing ones for 

6 The following algorithm parameters are selected for 11151 : filter values 
P-q = pa = Po = 0.6; for 1171 : step size e pi according to 1171 Eq. (12)], 



relative skew estimation as in 1121 with Kij 
parameter A = 0.9. 



Kji = 5; and for (Tf): filter 



a fixed number of exchanged packets. Despite having higher 
complexity, we can conclude that the proposed algorithms 
have superior estimation performance and convergence speed 
compared to the other methods. 

VIII. Conclusions 

In this paper, we presented two cooperative and fully 
distributed network synchronization algorithms, which can be 
utilized when the measurement noise is (approximately) Gaus- 
sian. Using standard communication hardware, this approxi- 
mation was verified by measurements. The synchronization 
algorithm design is based on message passing in a factor graph 
representation of the statistical model. Belief propagation (BP) 
and mean field (MF) message passing were applied to perform 
MAP estimation of the local clock parameters. We studied 
convergence, convergence rate, and accuracy, and found that 
in all three criteria, BP and MF are able to outperform existing 
algorithms. However, our results are in some cases still far 
away from a newly derived Bayesian CRB, indicating the 
possibility for further improvement. 
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Appendix 
A. ML-estimate of the delay 

The ML estimate of Ay is given by 



Ay = argmaxlogp(cy|0y; Ay), 



where 



block entries are 



Jflli.i — T" 



1 ^ 



'Jjt.J 



2 Z^ 
fe=l 



<l>i,AK) 1 
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l^Jil^-Ji 



logp ( Cy|0y,Ay) OC - W^i^^M 

\\c j ^ i -F j .+ i (e ij ,A ij )f 

Since F.j_y (0y, Ay) is linear in Ay, taking the derivative of 
logp(cy \0ij ; Ay) with respect to Ay and equating the result 
to zero, immediately yields 



Ay (0y) 



o i 



Kii 


Ci,ij 


J* Cjj'2 


1 




^J 


+*ii 


cm 




-#ii 




c j,ji 






Kij + Kji 





for j G N(i) and [Jj]t,j = else. 

2) Expectations of Kg[Ji] and Ee[3. p ]: The expectations 
Ee[J(] and Ee[J p ] have to be taken over the inverse clock 
skews, i.e. Eg[l/a™], for n up to 4. Since the clock skews are 
Gaussian distributed and close to one, we use the approxima- 
tions 

E{l/oi}tt2-n a ,i E{l/ a 2} ^a 2 ati + E{l/ ai } 2 

E{l/a?} ^E{l/ flj } 3 + 3E{l/a,}4 



E {1/af } « E {1/a,} 4 + 6 E {1/a,} 2 o*^ + 3 a 



j 4 . 

Q.Z 



C Belief Propagation Update Rules 

Derivation of the message parameters for m Pi ,^g{(Qi) in 



A', 



Kij /3i 



Ki 



K 3l ft 



Ki 



K^ a t 



K, 



Kji a. 



(32) 



with the averaged time stamps c.^y = 1/ify Y^ k Ci(t\, ' ), 
Ciji = l/Kji J2 Ci(tji tl ) of i, and djji 

Cj,io = V-^yEfcCj(%i) of i- 



Wi £i <*(*#(>), 



B. Computation of Fischer Information Matrix 

1) Computation of J i: From the true likelihood © we have 
the parameter set £j T 1 , £t; for every connected node pair (j, j) 

as 

f-1,%3 = [ F *^j (fiij i Ay ) T F j^i ( e u i Ay ) T ] 



m py ^(^) = y P y(0^) m g ,^ ptj (9' 3 )d9' 3 
« /exp (_-L||Ay^ + By0;.|[ 2 ) x 

fGi(0Q\ /■ ( GiMSl \ An' 
= exp ^-^-) y exp ^ ^ j dflj 

{Gd0i)\ 

ex exp 



ocA/e; (/x in ^,S ini y) . 

The functions G;(0Q and Gy (#£,#'■) are given by the expo- 
nent 



J i,y 











where I*,,/ denotes a fc x / identity matrix. Applying (fl4l > leads 
to the symmetric main diagonal block entries 



("j Mextj'i) ^extj'i I "j Mextji) 

Gi{0i) + Gij(9 i ,0j) 



with 
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where 0y,!([^] 2 ) = q(^) - [«J] a and ^y, 2 (^) = 



pie 



Ay for any pair (i,j) € f. The off-diagonal 



— f)' T A T R —V'V^ 1 ii 
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and 



S ' - ~2~ ( B »j B « + ^m^extji) 
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If the neighbor j is a master node, i.e., 'S ext - = diag(oo, oo), 
the parameters reduce to 

y; -1 — a t A- 

For the parameters in (fTTt we utilized the generic formula 
for multiplying Gaussian normal distributions: 



l-/Vx(/ii > £ i )ocJV r x (/' ) £) 



with 5T 1 = J2 t S- 1 and S- V = E, E, r M 



(33) 



D. Mean Field Update Rules 

Derivation of the message parameters for m Pi ._ i .g>.(0i) in 



hv-urM) = ex P (/ lo S (PiMj)) bjty) W, 

ocexpf --L /"||Ay^ + By^|| 2 X 



with the parameters (l27t and (|28). The message parameters of 
are derived equivalently to (1331) . 
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